Abstract. Aided and automatic target recognition (Ai/ATR) capability is
a critical technology needed by the military services for modern combat.
However, the current level of performance is largely deficient compared
to the requirements. This is largely due to the difficulty of acquiring
targets in realistic environments, but it is also due to the difficulty
of getting new concepts from, for example, the academic community,
because of limitations on the distribution of classified data. The
difficulty of achieving the required performance has limited the
fulfillment of the promise so anticipated by the war fighter. We review
the metrics, imagery databases, and sensors associated with Ai/ATR
performance and suggest possible technical approaches that could enable
new advancements in military-relevant performance. ©2011 Society of
Photo-Optical Instrumentation Engineers (SPIE). [DOI: 10.1117/1.3601879]

Subject terms: Aided and automatic target recognition; Aided and automatic target
recognition performances; military target acquisitions; receiver operator
characteristics; clutter; target variability.

1 Introduction

This  paper  presents  a  tutorial  on  the  performance  metrics, 
status,  and  prognosis  of  aided/automatic  target  recognition 
(Ai/ATR) for those who are not close to the military application
of the technology, but who may be able to contribute
to  its  ultimate  successful  development.  Ai/ATR  is  a  generic 
term  to  describe  automated  processing  functions  carried  out 
on  imaging  sensor  data  in  order  to  perform  operations  rang¬ 
ing  from  simple  cuing  of  a  human  observer  to  complex, 
fully autonomous object acquisition and identification. ATR
is fully autonomous; an example is the terminal acquisition
phase of a missile seeker. Aided target recognition (AiTR)
processing, in contrast, presents image annotations to a human
observer, who makes the final decision as to the importance and
veracity of the information generated and the action to be
taken. In this paper, the imaging sensors that generate the
data for the Ai/ATR processor are platform centric, including
visible and electro-optical/infrared (EO/IR), 3-D LADAR, and
imaging radar [e.g., synthetic aperture radar (SAR)]. EO/IR
includes multi- and hyperspectral imaging. Signal processing of
data from nonimaging sensors, such as acoustic, seismic, and
magnetic sensors, is not considered, although these sensor
outputs can be used as cues in a multisensor configuration for
Ai/ATR.

2  Military  Importance 

Ai/ATR  is  an  extremely  important  technology  for  military 
operations  that  has  not  yet  realized  its  full  tactical  promise. 
A  fully  reliable  Ai/ATR  can  enhance  lethality  and  surviv¬ 
ability  of  the  war  fighter  and  platform.  An  Ai/ATR  operates 
on  sensor  data  in  order  to  process  information  for  decision 
making. The primary value that an Ai/ATR adds to a weapons
system is a shorter engagement timeline for target acquisition.
The rapid acquisition and servicing of targets increases the
lethality and survivability of the weapons platform/soldier.
Whether  the  tactical  scenario  is  the  onslaught  of  an  array  of 
combat  vehicles  coming  through  the  Fulda  Gap,  which  was 
feared  during  the  Cold  War,  or  the  identification  of  humans 
with  intent  to  kill  in  an  urban  scene,  the  identification  of  the 
threat  for  avoidance  or  engagement  is  paramount  to  survival 
and  threat  neutralization. 

There are many military scenarios where a reliable Ai/ATR
capability would provide an enormous advantage to the soldier.
A rapid wide-area search to provide alerts in larger fields of
regard is the classical example that has always been envisioned.
Ai/ATR can also help overcome unmanned air-/ground-vehicle
bandwidth limitations by selecting only target information for
transmission to a weapons platform. A reliable onboard Ai/ATR
would select and send only target information back to the
unmanned air vehicle (UAV) operator, avoiding the enormous data
bandwidth needed to transmit the complete scene over the flight
path, from which the operator must extract the target.
Munitions precision targeting and lock-on-after-launch seekers
are other examples of fully autonomous ATR. Persistent
surveillance (PS) presents a first military application
opportunity for less technically sophisticated Ai/ATR, in that
change and anomaly detection are of primary significance. Things
that change in a scene, or are different, are of primary
importance in PS. Temporal techniques, such as change-detection
algorithms and moving-target indication (MTI), become first-step
candidate approaches. Change detection can be a major tool in
improvised explosive device (IED) detection. Disturbed earth,
where a device has been buried, presents a significantly
different signature than undisturbed earth. The disturbed earth
presents a much more uniform, blackbody-like spectral signature
compared to the much more structured signature of undisturbed
soil.1,2
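
As a rough illustration of why change detection is a natural first
step for PS, the following Python sketch flags pixels whose intensity
differs between two looks at the same scene. It assumes the two frames
have already been spatially registered; the function names, the
threshold, and the toy frames are hypothetical and are not drawn from
any fielded algorithm.

```python
def change_mask(reference, current, threshold):
    """Return a boolean mask that is True where |current - reference| > threshold.

    Both frames are lists of rows of pixel intensities with identical shape
    and are assumed (for this sketch) to be spatially registered already.
    """
    if len(reference) != len(current):
        raise ValueError("frames must have the same number of rows")
    mask = []
    for ref_row, cur_row in zip(reference, current):
        if len(ref_row) != len(cur_row):
            raise ValueError("frames must have the same number of columns")
        mask.append([abs(c - r) > threshold for r, c in zip(ref_row, cur_row)])
    return mask


def changed_pixels(mask):
    """List the (row, column) coordinates flagged as changed."""
    return [(i, j) for i, row in enumerate(mask) for j, flag in enumerate(row) if flag]


# Toy 3x4 frames (hypothetical values): one pixel brightens between looks.
reference = [[10, 11, 10, 12],
             [10, 10, 11, 10],
             [12, 10, 10, 11]]
current = [[10, 12, 10, 12],
           [10, 10, 40, 10],
           [11, 10, 10, 11]]

mask = change_mask(reference, current, threshold=5)
print(changed_pixels(mask))
```

Simple differencing of this kind responds to any change, including
illumination and sensor noise, so a practical system would need
further discrimination before alerting an operator.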

Extremely large coverage areas, such as those required in PS or
for airborne detection of IEDs along a roadway, with sufficient
resolution and update rate become driving sensor parameters.
The need for ground-to-ground Ai/ATR in urban environments is
amplified by the huge fields of regard (~2π steradians), the
shortness of timelines, and the need to discriminate combatant
from noncombatant. The difficulty of the Ai/ATR task is highly
task dependent, and a canonical data set is always a concern
for training and evaluation in a military scenario.

Fig. 1 Generic Ai/ATR algorithm showing discrimination functions processed on image.

All  three  services  are  engaged  in  research  and  develop¬ 
ment  for  reliable  Ai/ATR  capabilities  for  myriad  combat 
missions.  Army,  Navy,  and  Air  Force  are  pursuing  Ai/ATR 
with  sensor  packages  for  their  respective  platforms  to  do  the 
following:  reconnaissance,  intelligence,  surveillance,  target 
acquisition, fire control, wide-area search and track,
countermine, and sensor fusion. Change detection and MTI that
relate to target disposition are also of interest. Army sensor
assets typically emphasize EO/IR because of sensor size,
weight,  and  power  constraints  on  the  platform,  whereas  Navy 
and  Air  Force  tend  to  emphasize  high  range  resolution  and 
SAR  radars  due  to  the  long  stand-off  ranges  associated  with 
ship  and  aircraft  engagement  ranges.  This  paper  focuses  on 
the  extremely  difficult  ground-to-ground  missions  associated 
with  Army  or  Marine  combat.  More  extensive  discussion  of 
sea  and  air  Ai/ATR  missions  can  be  found  in  the  unclassified 
open  literature  at  the  Defense  Technical  Information  Center 
(DTIC).3-8 

There  is  a  whole  hierarchy  of  possible  tasks  that  can  be 
of  interest  for  an  Ai/ATR  algorithm.  The  level  of  discrimina¬ 
tion  can  cover  a  whole  gamut,  from  detection  to  classification 
to  recognition  to  identification.  Definitions  of  these  military 
tasks for EO/IR and RF/SAR can be found in the literature.9-13
There  can  be  other  tactical  tasks  that  do  not  fall  neatly  into  this 
hierarchy.  For  example,  target  tracking,  aim  point  selection, 
and  target  prioritization  are  target-engagement  relevant  tasks 
that  can  be  candidates  for  automation  in  the  target-acquisition 
and  fire-control  processes.  When  the  Soviet  Union  was  the 
premier  potential  adversary  for  the  United  States  and  nuclear 
war  was  not  considered  as  an  option,  the  most  important  land 
warfare  conflict  envisioned  was  tank-on-tank  battles.  In  this 
scenario,  the  classic  ATR  task  was  detection  and  recognition 
with sufficient detail to engage the target with a weapon.
Today, one of the most difficult tasks of interest is
identification of intent. Whereas, in the past, detection of a
human may have been sufficient, today the soldier must also
determine the intent of the human detected. Is the intent of
the detected human hostile? In PS for situational awareness,
changes are the most important information for alerting and
bringing other sensor assets to bear. Have militarily
significant assets moved, or have new ones appeared? Although
the technical sophistication of Ai/ATR has not progressed
rapidly, the sophistication of the performance required from
automated sensing has increased significantly.

A  very  simplified  diagram  of  a  generic  Ai/ATR  algorithm 
is  shown  in  Fig.  1.  The  image  from  the  sensor  is  fed  into 
the front end of the processor. Preprocessing is performed
first; it can include standard image-processing techniques to
reduce/remove noise, perform image orientation,
etc.  Features  are  extracted  so  that  candidate  regions  of  in¬ 
terest  (ROIs)  are  segmented,  anomalies  identified,  and  de¬ 
tections  declared.  Higher  level  features  are  found,  for  ex¬ 
ample,  by  comparing  segmented  regions  to  templates  or 
stored  models  of  targets.  At  this  point,  higher  level  dis¬ 
criminations  may  be  declared.  As  mentioned  earlier,  there 
exists  a  whole  hierarchy  of  potential  target  discriminations. 
For ground combat, examples of these two-class discriminations
are as follows: classification (tracked versus wheeled),
recognition (truck versus tank), and identification (M1 tank
versus T72 tank). Similar discriminations exist for air and
naval warfare. In recent years, higher level discrimination has
come to include “fingerprinting,” in which a specific entity
must be identified (e.g., establishing that a particular
vehicle was the one that planted the IED). An enormous array of
algorithms has been proposed, implemented in hardware, and
tested within many Department of Defense (DoD) services and
agencies. Representative algorithm classes are statistical,
shape based (template/model), MTI, increased dimensionality
(e.g., 3-D LADAR),14-16 hyper-/multispectral (MS/HS),17-19 and
neural nets. Multisensor phenomenologies have been tried,
including multisensor, where more than one sensor is looking
at  the  same  target;  multilook,  where  one  sensor  gets  several 
looks  at  the  target  from  different  aspects;  and  multimode  fu¬ 
sion,  where  sensors  of  different  modalities  sense  the  target 
(e.g.,  acoustic  and  EO  signals  are  fused).  Many  variations  of 
algorithms  have  been  proposed  and  attempted  in  hardware 
and  software,  and  a  survey  list  of  algorithms  can  be  found  in 
the  literature.20 

In order to illustrate the difficulty of ground-to-ground
Ai/ATR, Fig. 2 shows a representative set of IR sensor scenes
for the same targets in a variety of backgrounds. The targets
are a sedan, a pickup truck, a van, and SUVs. In order to give
the reader a realistic feel for the task difficulty, no
annotations are given to show the targets in the scenes. The
same scenes, with each target indicated by a superimposed box
displaying the ATR annotation, are shown later in Fig. 6. These
figures are also shown in order to demonstrate the difficulty
of the Ai/ATR task with midwave IR thermal imagers. The most
prolific battlefield sensors in the U.S. Army after the human
eyeball and night-vision goggles are thermal imagers.

Fig. 2 Typical IR scenes with targets unannotated.

3  Figures  of  Merit 

3.1  Three  Bottom-Line  Figures  of  Merit 

The three bottom-line figures of merit for Ai/ATR are receiver
operator characteristic (ROC) curves, confusion matrices, and
time. The ROC curves show the relationship of the algorithm
detection probability to the false alarms. They show how well
the ATR discriminates real targets of interest from noise
sources or background clutter objects. Figure 3 (Ref. 21) shows
a typical ROC curve for a developmental Army ATR. The different
curves correspond to using different numbers of spectral bands
in the midwave IR (MWIR) and long-wave IR spectral regions
using a constant false alarm rate (CFAR) decision algorithm.
The movement of the family of curves to the left and higher
indicates higher performance.

Fig. 3 Receiver operator characteristic (ROC) curve.

Confusion matrices show the relationship between the real
target identity and what the ATR called it. Higher level
discrimination performance, such as recognition or
identification, is displayed in the confusion matrices. Figure 4
shows a stylized confusion matrix for algorithm identification
performance against ground combat vehicles. A detailed
discussion of the considerations for the measurement of confusion
matrices is given in Ref. 22.
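
As a minimal illustration of how such a matrix is tallied, the
following Python sketch counts (true identity, ATR call) pairs and
reports the fraction of each true class that was called correctly.
The class labels and the sample calls are hypothetical and are not
taken from Fig. 4 or from any measured data.

```python
from collections import Counter


def confusion_matrix(truth, calls, labels):
    """Rows are true identities, columns are what the ATR called them."""
    counts = Counter(zip(truth, calls))
    return [[counts[(t, c)] for c in labels] for t in labels]


def per_class_accuracy(matrix):
    """Fraction of each true class called correctly (diagonal / row sum)."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Hypothetical labels and calls, for illustration only.
labels = ["tank", "truck", "APC"]
truth = ["tank", "tank", "tank", "truck", "truck", "APC", "APC", "APC"]
calls = ["tank", "tank", "APC", "truck", "tank", "APC", "APC", "truck"]

matrix = confusion_matrix(truth, calls, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>6}: {row}")
print(per_class_accuracy(matrix))
```

The off-diagonal entries are the informative part: they show which
target types the algorithm confuses with one another, which a single
overall accuracy figure would hide.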

Reduced time to acquire the targets within the sensor field of
view and field of regard is the real benefit of using Ai/ATR.
Measured AiTR timelines have been shown to be an order of
magnitude shorter than human-alone timelines (see Ref. 23).

3.2  Imagery  Data  Set 

In  order  to  carry  out  a  performance  evaluation  of  an  ATR 
algorithm  for  imaging  sensors  in  terms  of  ROC  curves  and 
confusion  matrices,  it  is  necessary  to  have  a  relevant  im¬ 
agery  data  set.  For  example,  if  the  desired  ATR  algorithm 
is  for  a  tank  fire  control  mission  and  the  main  sensor  is  the 
IR  gunner’s  primary  sight,  then,  IR  imagery  of  a  scene  with 
threat  targets,  for  all  variants  and  at  all  poses  and  orienta¬ 
tions,  is  required  in  all  relevant  backgrounds.  For  air-to-air 
combat  and  surface  naval  warfare,  a  similar  set  of  the  rele¬ 
vant  targets  is  required.  Another  scenario  of  close  air  support 
requires  a  similar  set  of  ground  targets,  but  with  another  set 
of variables for target articulation (i.e., which direction the
gun  is  pointing).  The  issue  of  the  target  signature  set  has 
been  well  documented.24  These  requirements  then  imply  the 
generation  of  a  library  of  IR  imagery,  which  is,  typically, 
classified.  Not  only  is  the  imagery  classified,  but  the  sensor 
parameters  for  the  gathering  device  are  classified.  This  has 
been  a  significant  issue  for  military  Ai/ATR  development. 
The  necessary  classified  imagery  is  easy  enough  to  obtain 
and  the  service  laboratories  do  this  extensively.  However, 
the  imagery  set  can  be  quite  extensive  and  cannot  be  re¬ 
leased  to  noncleared  organizations.  For  example,  university 
researchers with noncitizen students cannot get the necessary
imagery to design the algorithm and test it. Instead, we
have had to live with algorithms developed against civilian
vehicles  on  U.S.  highways.  Extrapolation  to  realistic  military 
scenarios  is  extremely  difficult,  if  not  impossible,  and  there 
can  be  no  free  back-and-forth  interaction  for  the  development 
of  Ai/ATR  among  the  government  labs,  defense  industry,  and 
academia. 

The  issue  of  the  canonical  imagery  data  set  for  perfor¬ 
mance  quantification  is  so  severe  that  a  special  DoD  com¬ 
mittee  has  been  chartered  to  define  problem  sets.  This  is  the 
ATR  Working  Group.25  Sanctioned  problem  sets  permit  the 
establishment  of  universal  metrics  to  assess  algorithm  per¬ 
formance  collectively  and  scientifically.  The  same  problem 
set  can  be  used  to  test  any  number  of  candidate  algorithms 
and  permits  quantifiable  difference  measurements  across  all 
candidates. 

The  problem  of  an  unclassified,  canonical  stimulus  set 
of  imagery  has  been  somewhat  addressed  in  the  last  several 
years  with  the  release  of  a  specially  gathered,  unclassified  IR 
imagery set for Ai/ATR algorithm development. Unclassified
sensors were used to obtain MWIR and visible imagery
of  tactical  vehicles,  civilian  vehicles,  and  people  in  realistic 
tactical  scenes  with  corresponding  ground-truth  and  meteo¬ 
rological  data.  This  >300-GB  imagery  data  set  is  available 
by  contacting  SENSIAC26  at  a  cost  of  several  hundred  dol¬ 
lars.  Although  this  is  only  one  data  set  for  one  scenario,  it 
is  a  significant  step  toward  enabling  the  injection  of  a  wider 
academic  community  into  the  research  on  Ai/ATR. 

Once  stimulus  data  have  been  obtained,  the  data  must  be 
separated  into  a  training  set  and  a  test  set.  The  algorithm 
must  be  trained  on  a  relevant  set  of  imagery  that  relates  to 
the  mission  scenario  and  will  expose  the  algorithm  to  all  the 
variables that it will be expected to handle. This means the
target set must be appropriate, including not only the targets
themselves but also their variants across environments,
operational conditions, and backgrounds. Various environments
are needed because the same vehicle can appear different by
season, from day to night, and even with time of day; consider
especially the diurnal and seasonal variations in the infrared
spectral regions. A set of relevant operational conditions is
needed because the target signature varies depending on whether
the target is stationary, moving, or firing, whether it has
been rained on, whether it is camouflaged, etc. Different
backgrounds present a variety of confusers and competitive
false targets. Because it is impossible to sample the
effectively infinite set of conditions, a judicious set of
samples must be chosen that spans enough of the complete
target/background set to give some confidence that the entire
space has been faithfully represented. Agreement on this point
is usually a
major  bone  of  contention  between  a  government  evaluator 
and  an  industrial  developer. 

In  order  to  use  the  chosen  data  set  of  stimuli  for  Ai/ATR 
testing,  the  data  set  must  include  ground-truth  data.  That  is, 
the location of legitimate targets in the scene must be
determined digitally in order to score the ATR annotations.
Usually, an error box is defined around the true target, within
which an ATR annotation is accepted as a true detection. This
can be a tedious process, even with modern computer software.
There are only a few laboratories in the Defense Department
that routinely carry out this ground truthing and score Ai/ATR
algorithms for the community, source selections, and mission
accomplishment. Two of these are the Army’s Night
Vision  &  Electronics  Sensors  Directorate  (NVESD)  and  the 
Air  Force’s  Wright  Patterson  Research  Laboratories. 


3.3  Simulated  Imagery 

One might consider generating simulated imagery of tactical
scenes as a surrogate for realistic test imagery, which could
obviate testing against all the possible real-world scenes.
This concept has been investigated and shown to be
problematic.27 Testing with simulated imagery has shown that,
although the detection probabilities are quite comparable
between synthetic and realistic imagery, the false alarm rate
(FAR) was much different with simulated imagery than with
realistic image inputs. The hypothesis suggested for this
difference is that real and simulated backgrounds, where false
alarms are generated, differ. The synthetic image generators
evidently produce different target-confusing regions in the
background than real backgrounds do. Additionally, synthetic
noise can differ significantly from true sensor noise. The
sensor noise must be characterized extremely well in order to
simulate it realistically.

3.4 Receiver Operator Characteristics Curve Determination

In  order  to  test  an  Ai/ATR  algorithm  for  its  detection  perfor¬ 
mance,  as  determined  by  its  ROC  curve,  the  requisite  data 
set  from  a  relevant  imager  viewing  a  relevant  operational 
scene  needs  to  be  digitized  and  fed  into  the  algorithm  un¬ 
der  test.  The  single-frame  processors  process  the  imagery, 
frame  by  frame,  and  nominate  image  sections  as  targets. 
Usually, a recognition decision is also reported. The
annotations are scored by comparison to the ground truth, and
an ROC curve is generated for that data set (Fig. 5). Slightly
different approaches to scoring and evaluation are required for
multiframe  processors  and  those  designed  to  look  for  moving 
targets. 

Fig. 5 Generation of a ROC curve.

Fig. 6 Annotated images from Fig. 2. Top two images have a sedan
and pickup truck, on left and right, respectively. Bottom two images
(from left to right) show an SUV, pickup truck, van, and SUV.

The curve in Fig. 5 is generated by feeding the digitized image
frames into the computer that is hosting the ATR algorithm. As
each detection decision is made, its location is matched to the
ground-truth data file for the real targets. If the ATR
algorithm declaration is within an established error region of
the real target, then it is recorded as a true detection. If
the declaration is not in a real target region, then it is
recorded as a false alarm. The higher the curve is and the more
to the left, the better the performance is. See Fig. 3 for a
set of curves showing performance improvements as the curves
move higher to the left. The generated curve is unique for the
processor, scene, training set, and test set. Here lies the
bane of the ATR technology. Any change in target condition,
location, signature, background, processor/algorithm
characteristics, etc., can cause a different call for the
annotation. The hope of the technology is that sufficiently
robust algorithms can still correctly acquire the targets in
backgrounds to give the operator enough confidence to use it,
with the commensurate improvement in combat effectiveness. The
reliability with which ATR can do this in military applications
is generally unacceptable in all but a few situations. The
veracity of this statement is difficult to substantiate without
reference to classified literature.
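
The matching of declarations to ground truth described above can be
sketched in Python as follows. This is only an illustration under
stated assumptions: the error region is taken to be an axis-aligned
pixel box, each target is credited at most once, and repeat
declarations on an already detected target are counted as false
alarms. Actual scoring rules differ between evaluators, and the
`Box` and `score_frame` names and the sample coordinates are
hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned error box around a ground-truth target, in pixels."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def score_frame(declarations, truth_boxes):
    """Label each (x, y, confidence) declaration as true detection or false alarm.

    Returns a list of (confidence, is_true_detection) pairs, highest
    confidence first. A target is credited once (assumption of this sketch).
    """
    detected = set()
    scored = []
    for x, y, conf in sorted(declarations, key=lambda d: d[2], reverse=True):
        hit = None
        for i, box in enumerate(truth_boxes):
            if i not in detected and box.contains(x, y):
                hit = i
                break
        if hit is not None:
            detected.add(hit)
        scored.append((conf, hit is not None))
    return scored


# Hypothetical ground truth and ATR declarations for one frame.
truth_boxes = [Box(10, 10, 30, 30), Box(100, 50, 130, 80)]
declarations = [(20, 20, 0.9), (200, 200, 0.7), (110, 60, 0.6), (22, 18, 0.4)]

scored = score_frame(declarations, truth_boxes)
print(scored)
```

The size of the error box is itself an evaluation parameter: a larger
box converts near misses into true detections, so it has to be fixed
before algorithms are compared.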

The  method  of  measurement  of  how  an  ATR  algorithm 
performs  (i.e.,  the  determination  of  the  ROC  curve)  is  crucial 
to  understanding  what  we  expect  an  ATR  to  do  and  estab¬ 
lishes  the  database  characteristics  to  evaluate  it.  The  method 
described  here  is  the  method  developed  at  the  U.S.  Army’s 
CERDEC  NVESD  by  a  team  led  by  Carl  Hoover  and  Clare 
Walters.28,29 Other evaluations of ATR ROC curves are similar.
The first thing to recognize is that most ATR algorithms in use
today, including those that have been tested in the laboratory,
are based on a CFAR parameter. That is, the declaration of a
target rests on the setting of a threshold for the number of
false alarms that will be tolerated. Whatever algorithm
parameters are calculated from the processing of the digital
image, a confidence is established as a function of that set of
parameters based on training against a relevant image set. The
threshold parameter can then be chosen based on the FAR and
detection probability (P_d) that are desired. P_d and FAR
are set based on the operational requirements.
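
One simple way to picture the threshold choice is sketched below in
Python: given confidences from scored training imagery, pick the
lowest confidence threshold whose false alarm rate stays within a
budget and report the P_d obtained at that threshold. This is an
illustrative assumption about how such a selection could be done, not
the NVESD procedure; the function name, the search area, and all
numbers are hypothetical.

```python
def choose_threshold(false_alarm_confs, target_confs, area_sq_deg, max_far):
    """Pick the lowest threshold whose training-set FAR is within max_far.

    false_alarm_confs: confidences of declarations scored as false alarms.
    target_confs: one confidence per ground-truth target (use 0.0 for a
        target that was never declared).
    area_sq_deg: total area searched, in square degrees.
    Declarations with confidence >= threshold are kept.
    Returns (threshold, far, p_d), or None if no threshold meets the budget.
    """
    candidates = sorted(set(false_alarm_confs) | set(target_confs))
    for t in candidates:
        far = sum(c >= t for c in false_alarm_confs) / area_sq_deg
        if far <= max_far:
            p_d = sum(c >= t for c in target_confs) / len(target_confs)
            return t, far, p_d
    return None


# Hypothetical training results: 5 false alarms and 4 targets over 10 sq deg.
false_alarm_confs = [0.2, 0.3, 0.35, 0.5, 0.8]
target_confs = [0.4, 0.6, 0.7, 0.9]

result = choose_threshold(false_alarm_confs, target_confs,
                          area_sq_deg=10.0, max_far=0.2)
print(result)
```

Choosing the lowest admissible threshold maximizes P_d for the given
false alarm budget on the training data; whether that operating point
carries over to operational imagery depends on how representative the
training set is.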

Once  the  threshold  CFAR  is  chosen,  the  ATR  algorithm 
can  be  tested  against  a  test/evaluation  set  of  imagery  that 
is different from the training set. How representative the
training data set is of the test/operational set is always a
matter of intense discussion between algorithm developer and
service evaluator.

With the CFAR threshold selected, the algorithm can be run
against the test imagery set. All annotations of the algorithm,
real target or false alarm, can then be ordered by confidence.
Critical to this process is relating the algorithm’s
annotations to real targets based on the image ground truth.
How an annotation is scored as a true target hit or a false
alarm is a critical component of the evaluation process, and
software has been developed to do this.

The ROC curve shown in Fig. 5 can now be generated. Starting
with the highest confidence value, a point is established on
the P_d versus FAR axes. As the computer runs down the
threshold confidence values, from highest to lowest, true
target detections and false alarms are plotted. As the number
of annotations is increased, a smooth curve similar to Fig. 3
is generated.
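
The sweep described above can be sketched in a few lines of Python.
The input is a list of scored annotations, each a (confidence,
is_true_detection) pair; the function name, the normalization by
search area, and the sample values are assumptions made for
illustration only.

```python
def roc_points(scored, n_targets, area_sq_deg):
    """Sweep the confidence threshold from highest to lowest.

    scored: iterable of (confidence, is_true_detection) pairs.
    n_targets: number of ground-truth targets in the test set.
    area_sq_deg: total area searched, in square degrees.
    Returns a list of (FAR, P_d) points, one per annotation admitted.
    """
    points = []
    hits = 0
    false_alarms = 0
    for _conf, is_hit in sorted(scored, key=lambda s: s[0], reverse=True):
        if is_hit:
            hits += 1
        else:
            false_alarms += 1
        points.append((false_alarms / area_sq_deg, hits / n_targets))
    return points


# Hypothetical scored annotations: 2 true detections and 2 false alarms.
scored = [(0.9, True), (0.7, False), (0.6, True), (0.4, False)]

points = roc_points(scored, n_targets=2, area_sq_deg=4.0)
for far, p_d in points:
    print(f"FAR = {far:.2f} per sq deg, P_d = {p_d:.2f}")
```

With only four annotations the result is a staircase; the curve
becomes smooth only when the test set contains many targets and many
false alarm opportunities.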

The  importance  of  a  set  of  training  imagery  as  repre¬ 
sentative  of  the  operational  situation  is  another  crucial  con¬ 
sideration.  Careful,  judicious  choices  must  be  made  by  the 
evaluator  to  ensure  all  real  targets  are  deployed  in  tactically 
relevant  scenes.  The  algorithm  must  be  stressed  such  that 
the  war  fighter  has  confidence  in  its  use.  Conversely,  the 
algorithm  developer  must  understand  the  scenario  in  order 
to  design  the  algorithm  to  go  after  the  tactically  significant 
artifacts  in  the  scene. 

Range to the target can be very important information for the
processor, because it helps size the window of investigation.
Estimates of range to points in the image must be made. If the
weapon system has an integral rangefinder, then range is given
to the algorithm under test. If not, then the algorithm usually
uses some programmed technique, such as a flat-earth technique,
to estimate range, which can introduce significant errors into
the range value and, consequently, the target size. Knowledge
of range in the scene can enable a great enhancement to
algorithm performance. Other approaches, such as rescaling
selected regions in the image to a fixed range, have also been
used.
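
The source does not spell out the flat-earth technique, so the
following Python sketch shows only one plausible form of it: the
sensor sits at a known height above perfectly flat ground, each image
row corresponds to a depression angle, and the slant range is the
height divided by the sine of that angle. All parameter names and
values are hypothetical.

```python
import math


def flat_earth_range(row, center_row, ifov_rad, sensor_height_m, pitch_down_rad=0.0):
    """Slant range to the ground point imaged at a pixel row, assuming flat terrain.

    Rows increase downward in the image. ifov_rad is the angle subtended by
    one pixel. Returns math.inf for rows at or above the horizon.
    """
    depression = pitch_down_rad + (row - center_row) * ifov_rad
    if depression <= 0.0:
        return math.inf
    return sensor_height_m / math.sin(depression)


def expected_target_pixels(target_size_m, range_m, ifov_rad):
    """Approximate pixel extent of a target at a given range (small-angle)."""
    return target_size_m / (range_m * ifov_rad)


# Hypothetical sensor: 3 m above ground, 0.1 mrad pixels, level boresight.
range_m = flat_earth_range(row=340, center_row=240, ifov_rad=1e-4,
                           sensor_height_m=3.0)
pixels = expected_target_pixels(3.0, range_m, 1e-4)
print(f"range = {range_m:.1f} m, expected target width = {pixels:.1f} pixels")
```

Because the range varies as the reciprocal of the sine of a small
angle, a one-row error near the horizon shifts the estimate
substantially, and any terrain relief violates the flat-ground
assumption outright. Both effects are consistent with the significant
range errors noted above.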

The ROC curve is plotted with the probability of detection on
the vertical axis and false alarms on the horizontal axis.
Usually, FAR is in units of false alarms per square degree for
ground combat and false alarms per square kilometer for
airborne sensors. This is because, for a ground-based sensor,
the terrain covered by the sensor field of view extends from
very close range to the horizon. Typically, this experimental
ROC curve is compared to a specification ROC curve based on a
weapon system requirement to determine if the algorithm meets
the performance requirement.
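
A short Python sketch of the two steps just described follows:
normalizing a false alarm count to false alarms per square degree,
and checking a measured ROC curve against specification points. The
field of view is approximated as a small rectangle in angle, and the
pass criterion (the best measured P_d at or below each specified FAR
must meet the required P_d) is an assumption of this sketch rather
than a documented evaluation rule. All numbers are hypothetical.

```python
def far_per_sq_degree(n_false_alarms, hfov_deg, vfov_deg, n_frames):
    """False alarms per square degree, treating the field of view as a
    small hfov x vfov rectangle and summing the area over all frames."""
    return n_false_alarms / (hfov_deg * vfov_deg * n_frames)


def meets_spec(measured, spec):
    """Compare a measured ROC curve with specification points.

    measured, spec: lists of (far, p_d) pairs. For each specification
    point, the best measured P_d achieved at a FAR no greater than the
    specified FAR must be at least the required P_d.
    """
    for spec_far, required_p_d in spec:
        achievable = max((p_d for far, p_d in measured if far <= spec_far),
                         default=0.0)
        if achievable < required_p_d:
            return False
    return True


# Hypothetical numbers: 12 false alarms over 10 frames of a 4 x 3 degree view.
far = far_per_sq_degree(12, hfov_deg=4.0, vfov_deg=3.0, n_frames=10)
print(f"FAR = {far:.2f} false alarms per square degree")

measured = [(0.0, 0.5), (0.25, 0.5), (0.25, 1.0), (0.5, 1.0)]
print(meets_spec(measured, spec=[(0.1, 0.4), (0.3, 0.9)]))
print(meets_spec(measured, spec=[(0.1, 0.8)]))
```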


4  What’s  the  Problem? 

The  extreme  difficulty  of  the  military  target  acquisition 
task  has  thwarted  progress  in  the  development  of  image- 
processing  techniques  that  enable  an  acceptable  level  of  per¬ 
formance  for  the  war  fighter  in  harm’s  way.  Aided  target 
recognition  in  relatively  benign  environments,  such  as  low 
clutter,  has  been  shown  to  perform  at  a  useful  level.  However, 
medium  to  highly  cluttered  backgrounds  introduce  an  unac¬ 
ceptable  amount  of  false  alarms,  whereas  target  variability 
and  operational  environmental  conditions  also  have  a  sig¬ 
nificant  degrading  effect.  Higher  level  discriminations,  such 
as  target  recognition  and  identification,  fall  off  significantly 
compared  to  detection.  Previous  technical  articles  on  the  per¬ 
formance  of  military  Ai/ATR  technology  can  be  found  in  the 
literature.30-33  An  excellent  synopsis  of  types  of  algorithms 
is given by Bhanu.

4.1 False Alarms

A primary operational limitation for ATR is the false alarm
problem, driven by objects in the scene that can be confused
with targets (confusers) and by background clutter, which cause
the operator to spend excessive time interrogating them. This
problem is exacerbated by sensors, such as thermal imagers,
that are not visible imagers, do not have the resolution of
visible imagers, and produce imagery unfamiliar to normal human
vision.
Additionally,  in  tactical  situations  the  threat  can  introduce 
countermeasures  such  as  camouflage  and  defilade.  The  abil¬ 
ity  of  humans  to  discern  targets  is  still  significantly  greater 
than  that  of  electronic  processing  algorithms.35  However, 
humans  cannot  process  the  available  information  and  make 
decisions  at  a  fast  enough  rate  to  engage  targets  effectively. 
The  electronics  can  process  the  information  at  a  much  faster 
rate, and that is why the military continues to pursue an
effective Ai/ATR technology for military combat requirements.
It has been shown that timelines for target acquisition can be
reduced by about an order of magnitude when a human is aided by
Ai/ATR, compared with a human alone.35

Although there have been some successes in military Ai/ATR
in the services, the desired performance has been
significantly limited. The main challenge that has been
identified for military Ai/ATR is the level of false alarms
for detection encountered in real environments. The level of
false alarms in a tactical ground-to-ground scenario can be
high enough that the operator will turn off the Ai/ATR.
Besides increasing the time to acquire the real target and
frustrating the operator, false alarms can be dangerous.
Firing at a false target gives away the position of the
firing platform and makes it a target of counterfire.

4.2  Clutter 

A primary limitation of ATR technology is the lack of an
understanding of clutter and of a reliable clutter model that
can quantify scene difficulty. This difficulty is compounded
by the dependence of scene difficulty on the target of
interest: clutter that confuses detection of a vehicle
differs from clutter that confuses detection of personnel.
Clutter models are required that are more sophisticated than
the simple signal-to-clutter models representative of human
performance models. Approaches to quantifying clutter appear
in the literature, such as that of Lanterman et al.,36 and
there are also information theoretic approaches.37,38 The
ultimate clutter metric must surely contain some target
conspicuity factor. A clutter metric that is primarily a
function of signal-to-noise ratio or signal-to-clutter ratio
will not show the true dependency of performance on
real-world clutter. Further discussion of clutter modeling39
can be found in research funded by the Army at the Center
for Image Sciences.40 Besides clutter, camouflage and
signature disrupters can also degrade Ai/ATR performance,
which is another major reason Ai/ATR has been very difficult
for military applications.
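
To make concrete the kind of simple signal-to-clutter measure
that the text argues is insufficient, the following sketch
computes one common form of it: the absolute difference
between mean target and mean background intensity, divided by
the background standard deviation. The function name, the
sample pixel values, and this particular definition are
assumptions chosen for illustration; they are not taken from
the cited clutter literature.

```python
from statistics import mean, pstdev

def signal_to_clutter_ratio(target_pixels, background_pixels):
    """Simple SCR: |mean(target) - mean(background)| / std(background)."""
    clutter_sigma = pstdev(background_pixels)
    if clutter_sigma == 0:
        raise ValueError("background has no variation; SCR is undefined")
    return abs(mean(target_pixels) - mean(background_pixels)) / clutter_sigma

# Hypothetical intensity samples: one target seen against two backgrounds
target = [120, 125, 130, 125]
calm_background = [98, 100, 102, 100]
busy_background = [60, 140, 80, 120]

scr_calm = signal_to_clutter_ratio(target, calm_background)
scr_busy = signal_to_clutter_ratio(target, busy_background)
```

Both backgrounds have the same mean, yet the busier one gives
a much lower ratio. The number still says nothing about
whether the background contains target-like shapes, which is
the conspicuity information the text says a useful clutter
metric would need.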

4.3  Target  Variability 

Another performance limiter to aided target acquisition is
target variability under operational conditions. The target
can present any aspect and can have different signatures
under different environmental, operational, and background
conditions. Camouflage, concealment, and deception (decoys)
increase the dimensionality of the target space
significantly.

These variables plague Ai/ATRs in all the services and set
the most severe limitations on performance today for this
technology. Performance on discriminations of higher order
than detection degrades as the sophistication of the task
increases. The limitations imposed by false alarms and
variable environmental conditions might imply that the best
we can hope for is aided target recognition. Full automatic
target recognition may be unattainable or, at best, may take
a long time to mature.

5  What  Works? 

Years of research and development, coupled with constant
test-fix-test cycles for specific ad hoc mission targeting
applications, have resulted in the level of maturity for
Ai/ATR that we have today. It is impossible to give precise
data representative of state-of-the-art performance in an
unclassified forum. Shape-based approaches to
ground-to-ground stationary target indication have been
shown to give useful performance in low to medium clutter.
The clutter level is subjective because there has been no
clutter metric that accurately determines task difficulty.
One of the greatest unmet challenges of this technology is a
reliable clutter metric. Shape-based algorithm suites are
typically template matching schemes or comparisons of target
images to stored target models.
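
As a rough illustration of what a template matching scheme
does, the sketch below slides a small stored template over an
image and scores each position with zero-mean normalized
cross-correlation. This is a generic textbook form, not the
algorithm of any fielded Ai/ATR; the function names and the
toy scene are hypothetical.

```python
from math import sqrt

def normalized_cross_correlation(patch, template):
    """Zero-mean normalized correlation of two equal-size 2-D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    p_mean = sum(p) / len(p)
    t_mean = sum(t) / len(t)
    num = sum((a - p_mean) * (b - t_mean) for a, b in zip(p, t))
    den = sqrt(sum((a - p_mean) ** 2 for a in p)
               * sum((b - t_mean) ** 2 for b in t))
    # A flat patch or flat template has no shape to correlate against
    return num / den if den else 0.0

def best_match(image, template):
    """Slide the template over the image; return (score, row, col)."""
    th, tw = len(template), len(template[0])
    best = (-2.0, -1, -1)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = normalized_cross_correlation(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best

# Hypothetical 2x2 template embedded in a 4x5 scene at row 1, column 2
template = [[9, 1],
            [1, 9]]
scene = [[0, 0, 0, 0, 0],
         [0, 0, 9, 1, 0],
         [0, 0, 1, 9, 0],
         [0, 0, 0, 0, 0]]
score, row, col = best_match(scene, template)
```

A single fixed template only matches one aspect and one
signature, which is one way to see why target variability and
clutter degrade shape-based approaches.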

Ai/ATR from airborne sensor platforms has been shown to
perform somewhat better than ground-to-ground Ai/ATR. This
is because clutter viewed from the air competes less with
the target than clutter in the ground scenario. There is
also an advantage in recognizing an overhead aspect, which
is less complex than a ground-to-ground aspect and is not as
easily confused with overhead views of ground clutter. In
addition, aircraft altitude is typically known, which makes
range estimation easier, and the sensor field of view does
not extend to infinity as it does on the ground. Atmospheric
attenuation from the air tends to be much less than that for
ground-to-ground lines of sight.
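
The remark that known altitude eases range estimation can be
made concrete with simple geometry. The sketch below assumes
flat terrain and a known sensor depression angle below the
horizontal, both of which are simplifying assumptions
introduced here; the function and parameter names are
hypothetical.

```python
from math import sin, tan, radians

def slant_and_ground_range(altitude_m, depression_deg):
    """Flat-earth range to a ground point seen at a depression angle."""
    if not 0 < depression_deg <= 90:
        raise ValueError("depression angle must be in (0, 90] degrees")
    theta = radians(depression_deg)
    slant = altitude_m / sin(theta)          # line-of-sight distance
    # Looking straight down, the ground point is directly below
    ground = 0.0 if depression_deg == 90 else altitude_m / tan(theta)
    return slant, ground

# Hypothetical case: 1000 m altitude, line of sight 30 degrees down
slant_m, ground_m = slant_and_ground_range(altitude_m=1000.0,
                                           depression_deg=30.0)
```

With range in hand, the expected size of a target in pixels
follows from the sensor field of view, which narrows the
search. A ground-based sensor looking toward the horizon has
no comparable constraint.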

The current conflicts in southwest Asia have refocused the
important application of Ai/ATR from fire control to
persistent surveillance missions. Previously, the focus of
the technology was on the acquisition of targets for the
Comanche helicopter fire control. The paramount application
today, persistent surveillance to detect hostile activity,
is potentially a somewhat easier task. The approaches
developed for persistent surveillance that have had some
success are change detection and MTI. Detection of new
targets and missing targets in images compared to previous
images of the same location has had some success. MTI from
the air and from stationary ground platforms also has been
demonstrated. However, MTI from moving ground platforms has
problems with optical flow and with confusion over whether
the motion is target or platform induced.
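
A minimal sketch of change detection by image differencing is
given below. It assumes the two frames are already
co-registered and that objects are brighter than the
background, so a large positive difference is labeled as a
new object and a large negative one as a missing object. The
function name, frames, and threshold are hypothetical.

```python
def detect_changes(reference, current, threshold):
    """Return (row, col, kind) where intensity changed by more than threshold."""
    changes = []
    for r, (ref_row, cur_row) in enumerate(zip(reference, current)):
        for c, (ref, cur) in enumerate(zip(ref_row, cur_row)):
            diff = cur - ref
            if abs(diff) > threshold:
                kind = "appeared" if diff > 0 else "disappeared"
                changes.append((r, c, kind))
    return changes

# Hypothetical co-registered frames: bright objects on a dark background
earlier = [[10, 10, 10],
           [10, 90, 10],
           [10, 10, 10]]
later = [[10, 10, 88],
         [10, 12, 10],
         [10, 10, 10]]
changes = detect_changes(earlier, later, threshold=30)
```

The registration assumption is what breaks down on a moving
ground platform, where apparent motion of the whole scene
must first be separated from motion of the target.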

Two more sophisticated sensor approaches that offer more
image features on which a decision can be based are MS/HSI
and 3-D LADAR. Both add another orthogonal dimension to the
decision space, but both also increase system complexity and
cost. MS/HSI uses the unique spectral content of objects as
the discrimination metric between backgrounds and targets.
MS/HSI sensors can be used in a search mode for target
detection; however, at present they operate only in daytime
and are large, expensive systems. 3-D LADAR-based imagers
add the depth dimension to the image as another target
discriminator. The system-level downsides of 3-D LADARs are
that they have high power requirements, they are not useful
for searching, and laser power requirements constrain the
practical range that can be realized.

6  Opportunities  for  Advances  in  Aided  and 
Automatic  Target  Recognition  Performance 

There are many potential applications throughout the
services for reliable Ai/ATRs. However, except for a small
number of applications, the attainable level of performance
must be significantly improved to handle all the false
alarms and environmental variables that are encountered in
military scenarios. We cannot look to improvements in the
imaging sensors being used as the front ends for Ai/ATRs,
because they are already pushing the limits of physics.
Performance improvements must come from the ATR algorithm
concepts or from the way AiTR annotations are presented to
the observer in order to engage more of the observer’s
intellectual image processing. New techniques for extracting
objects from complex backgrounds are needed. These new
techniques would be expected to originate in academia and
could form the basis for a new springboard for image science
in the pursuit of useful military Ai/ATR capability.

Candidate starting points for military-relevant image
science approaches to Ai/ATR are pattern theoretic
approaches to understanding complex scenes41 or a
recognition-by-parts42-44 approach. The recognition of a
tactical, canonical geon in an image that is partially
obscured could imply the presence of a target of interest.
Eye-brain research could lead to a better understanding of
what needs to be extracted from a tactical image and
presented to the operator for enhanced recognition ability.
Other nonimage-based techniques, such as category theory,45
hierarchical systems,46 and gradient index flow,47 are
possible formalisms that might be applied to the Ai/ATR
problem. Any improvements realized in ATR performance would
come from algorithm improvements in software rather than
from improved sensor hardware.

A system-level approach to increasing Ai/ATR performance is
to take advantage of tactical networks on the battlefield.
There is a plethora of imaging and nonimaging sensors on the
battlefield that are being networked together for
transmission of information, such as targets, across
platforms. At each platform, the ATR could take the
off-board data and build a case for each indicated onboard
detection as to whether it is a target or a false alarm. For
example, an unattended acoustic sensor could supply
information to a tank computer that an image annotation was,
in fact, another tank. An example of an existing Army
platform that is designed for this kind of decision making
is the distributed common ground station. This approach to
low false alarm Ai/ATR would be constrained by the bandwidth
of the tactical network. It stresses the tactical network
capabilities, requires some added sophistication in the
algorithm software, and has no impact on the sensor.
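
One way to picture "building a case" for an onboard detection
from off-board data is a Bayes update in odds form, sketched
below. This is an illustrative formalism only; the source
does not specify a fusion rule. The sketch assumes each
off-board report can be summarized as a likelihood ratio
(how much more likely the report is if a target is present
than if it is not) and that reports are conditionally
independent. All names and numbers are hypothetical.

```python
def update_target_probability(prior, likelihood_ratio):
    """Bayes update: posterior odds = prior odds * likelihood ratio."""
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    posterior_odds = prior / (1 - prior) * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

def fuse_reports(prior, likelihood_ratios):
    """Apply off-board reports one after another (assumed independent)."""
    probability = prior
    for ratio in likelihood_ratios:
        probability = update_target_probability(probability, ratio)
    return probability

# Hypothetical numbers: the onboard detection alone is a coin flip.
# An acoustic report is 9 times more likely if a tank is present.
p_after_acoustic = fuse_reports(0.5, [9.0])
# A second, contradicting report (ratio below 1) lowers the estimate.
p_after_two = fuse_reports(0.5, [9.0, 0.25])
```

Each report here reduces to a single number, which hints at
how such a scheme could fit within limited tactical network
bandwidth.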

7  Summary  and  Conclusions 

Ai/ATR can provide significant enhancement to military
weapons platforms over human-only performance. AiTR can
provide enhancements to the weapons operator or intelligence
analyst for fire control, surveillance, reconnaissance,
intelligence, persistent surveillance, and situational
awareness. ATR can provide fully autonomous target
engagements, such as for missile seekers. Present use of
Ai/ATR by the military has been limited due to the level of
difficulty of the automated task. However, the technology is
being pursued in academia, industry, and government
laboratories.

Most prevalent state-of-the-art Ai/ATR algorithms today are
shape based, and their performance degrades significantly
under realistic operational conditions, such as clutter, a
variable target set, and target variability. Ground-to-ground
degradations have been shown to be more severe than those
for airborne target acquisition. Enhancements to shape-based
approaches can potentially provide a more robust capability;
temporal techniques, such as change detection and MTI, are
examples. Sensor-level improvements are MS/HSI for wide-area
search and camouflage detection and 3-D LADAR for higher
level recognition and identification.

Aided target recognition will mature more rapidly than ATR.
By off-loading the higher level decisions to the human, AiTR
can potentially provide an order of magnitude improvement in
target acquisition times. This is significant in war
fighting and is increasingly important in urban warfare.